A Threading-Based Method for the Prediction of DNA-Binding Proteins with Application to the Human Genome
Authors
Abstract
Diverse mechanisms for DNA-protein recognition have been elucidated in numerous atomic complex structures from various protein families. These structural data provide an invaluable knowledge base not only for understanding DNA-protein interactions, but also for developing specialized methods that predict the DNA-binding function from protein structure. While such methods are useful, a major limitation is that they require an experimental structure of the target as input. To overcome this obstacle, we develop a threading-based method, DNA-Binding-Domain-Threader (DBD-Threader), for the prediction of DNA-binding domains and associated DNA-binding protein residues. Our method, which uses a template library composed of DNA-protein complex structures, requires only the target protein's sequence. In our approach, fold similarity and DNA-binding propensity are employed as two functional discriminating properties. In benchmark tests on 179 DNA-binding and 3,797 non-DNA-binding proteins, using templates whose sequence identity is less than 30% to the target, DBD-Threader achieves a sensitivity/precision of 56%/86%. This performance is considerably better than the standard sequence comparison method PSI-BLAST and is comparable to DBD-Hunter, which requires an experimental structure as input. Moreover, for over 70% of predicted DNA-binding domains, the backbone Root Mean Square Deviations (RMSDs) of the top-ranked structural models are within 6.5 Å of their experimental structures, with their associated DNA-binding sites identified at satisfactory accuracy. Additionally, DBD-Threader correctly assigned the SCOP superfamily for most predicted domains. To demonstrate that DBD-Threader is useful for automatic function annotation on a large scale, DBD-Threader was applied to 18,631 protein sequences from the human genome; 1,654 proteins are predicted to have DNA-binding function. Comparison with existing Gene Ontology (GO) annotations suggests that approximately 30% of our predictions are new. Finally, we present some interesting predictions in detail. In particular, it is estimated that approximately 20% of classic zinc finger domains play a functional role not related to direct DNA-binding.
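The sensitivity/precision figures quoted in the abstract follow the usual definitions: sensitivity is the fraction of true DNA-binding proteins that are recovered, and precision is the fraction of positive predictions that are correct. The minimal sketch below illustrates the arithmetic. The counts `tp`, `fn`, and `fp` are hypothetical values chosen only to be consistent with the reported 56%/86% on 179 DNA-binding proteins; they are not taken from the paper.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of actual positives that were predicted positive."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Fraction of positive predictions that are actually positive."""
    return tp / (tp + fp)


# Hypothetical counts (assumption, not reported in the abstract):
# 179 DNA-binding proteins in total, so tp + fn must equal 179.
tp = 100  # DNA-binding proteins correctly predicted
fn = 79   # DNA-binding proteins missed
fp = 16   # non-DNA-binding proteins wrongly predicted as binding

print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"precision   = {precision(tp, fp):.1%}")
```

With a negative set of 3,797 proteins, even a small false-positive rate produces a noticeable number of false positives, which is why precision is reported alongside sensitivity for this kind of imbalanced benchmark.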
Similar resources
Prediction of 3D protein Structure based on Mutation of AKAP3 and PLOD3 Gene in Case of Non-Obstructive Azoospermia
Background: The present study has been designed with the aim of evaluating A-kinase anchoring proteins 3 (AKAP3) and Procollagen-Lysine, 2-Oxoglutarate 5-Dioxygenase 3 (PLOD3) gene mutations and prediction of 3D protein structure for ligand binding activity in the cases of non-obstructive azoospermic male. Materials and Methods: Clinically diagnosed cases of non-obstructive azoos...
In silico identification of epitopes from house cat and dog proteins as peptide immunotherapy candidates based on human leukocyte antigen binding affinity
The objective of this descriptive study was to determine Felis domesticus (cat) and Canis familiaris (dog) protein epitopes that bind strongly to selected HLA class II alleles to identify synthetic vaccine candidate epitopes and to identify individuals/populations who are likely to respond to vaccines. FASTA amino acid sequences of experimentally validated allergenic proteins of house cat and d...
In silico investigation of lactoferrin protein characterizations for the prediction of anti-microbial properties
Lactoferrin (Lf) is an iron-binding multi-functional glycoprotein which has numerous physiological functions such as iron transportation, anti-microbial activity and immune response. In this study, different in silico approaches were exploited to investigate Lf protein properties in a number of mammalian species. Results showed that the iron-binding site, DNA and RNA-binding sites, signal pepti...
Post-translational changes of histones, methylation level, and ERβ protein level in the cumulus cell genome of infertile women with endometriosis
Background: Endometriosis (which affects up to 50% of infertile women) is one of the major causes impacting female infertility. Endometriosis, defined as the presence of endometrial glands and stroma outside the uterine tissue, causes a wide range of functional disorders in the process of follicular development and changes in the follicular milieu, resulting in the formation of an incompetent o...
Evaluation of First and Second Markov Chains Sensitivity and Specificity as Statistical Approach for Prediction of Sequences of Genes in Virus Double Strand DNA Genomes
Growing amount of information on biological sequences has made application of statistical approaches necessary for modeling and estimation of their functions. In this paper, sensitivity and specificity of the first and second Markov chains for prediction of genes was evaluated using the complete double stranded DNA virus. There were two approaches for prediction of each Markov Model parameter,...
Journal title:
Volume 5, Issue
Pages -
Publication date 2009